{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "q1dEw-XVkzUT"
   },
   "source": [
    "# Gradient Boosting: CPU vs GPU\n",
    "This is a basic tuturoal which shows how to run gradient boosting on CPU and GPU on Google Colaboratory. It will give you and opportunity to see the speedup that you get from GPU training. The speedup is large even on Tesla K80 that is available in Colaboratory. On newer generations of GPU the speedup will be much bigger.\n",
    "\n",
    "We will use CatBoost gradient boosting library, which is known for it's good GPU performance.\n",
    "  \n",
    " You could try it out on Colaboratory, just pressing on the following badge:  \n",
    " \n",
    "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/catboost/tutorials/blob/master/tools/google_colaboratory_cpu_vs_gpu_tutorial.ipynb) "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "wPOIzC8amcEk"
   },
   "source": [
    "## Set GPU as hardware accelerator\n",
    "First of all, you need to select GPU as hardware accelerator. There are two simple steps to do so:  \n",
    "Step 1. Navigate to 'Runtime' menu and select 'Change runtime type'  \n",
    "Step 2. Choose GPU as hardware accelerator.  \n",
    "That's all!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "SvkCBNqRkX1t"
   },
   "source": [
    "## Importing CatBoost\n",
    "\n",
    "Next big thing is to import CatBoost inside environment. Colaboratory has built in libraries installed and most libraries can be installed quickly with a simple *!pip install* command.  \n",
    "Please ignore the warning message about already imported enum package. Furthermore take note that you need to re-import the library every time you start a new session of Colab."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 342
    },
    "colab_type": "code",
    "executionInfo": {
     "elapsed": 15323,
     "status": "ok",
     "timestamp": 1550052132809,
     "user": {
      "displayName": "Sergey Brazhnik",
      "photoUrl": "https://lh3.googleusercontent.com/-qx9xOxVVPXo/AAAAAAAAAAI/AAAAAAAAA0Q/8-j43i6dgow/s64/photo.jpg",
      "userId": "09427549251805676502"
     },
     "user_tz": -420
    },
    "id": "5A8e1XP3kqTx",
    "outputId": "bdb5c80e-fb0b-437d-b46f-b72b665f7a3f"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Collecting catboost\n",
      "\u001b[?25l  Downloading https://files.pythonhosted.org/packages/98/03/777a0e1c12571a7f3320a4fa6d5f123dba2dd7c0bca34f4f698a6396eb48/catboost-0.12.2-cp36-none-manylinux1_x86_64.whl (55.5MB)\n",
      "\u001b[K    100% |████████████████████████████████| 55.5MB 853kB/s \n",
      "\u001b[?25hCollecting enum34 (from catboost)\n",
      "  Downloading https://files.pythonhosted.org/packages/af/42/cb9355df32c69b553e72a2e28daee25d1611d2c0d9c272aa1d34204205b2/enum34-1.1.6-py3-none-any.whl\n",
      "Requirement already satisfied: pandas>=0.19.1 in /usr/local/lib/python3.6/dist-packages (from catboost) (0.22.0)\n",
      "Requirement already satisfied: six in /usr/local/lib/python3.6/dist-packages (from catboost) (1.11.0)\n",
      "Requirement already satisfied: numpy>=1.11.1 in /usr/local/lib/python3.6/dist-packages (from catboost) (1.14.6)\n",
      "Requirement already satisfied: python-dateutil>=2 in /usr/local/lib/python3.6/dist-packages (from pandas>=0.19.1->catboost) (2.5.3)\n",
      "Requirement already satisfied: pytz>=2011k in /usr/local/lib/python3.6/dist-packages (from pandas>=0.19.1->catboost) (2018.9)\n",
      "Installing collected packages: enum34, catboost\n",
      "Successfully installed catboost-0.12.2 enum34-1.1.6\n"
     ]
    },
    {
     "data": {
      "application/vnd.colab-display-data+json": {
       "pip_warning": {
        "packages": [
         "enum"
        ]
       }
      }
     },
     "metadata": {
      "tags": []
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "!pip install catboost"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "UKnT_rYWWhD8"
   },
   "source": [
    "## Download and prepare dataset\n",
    "The next step is dataset downloading. GPU training is useful for large datsets. You will get a good speedup starting from 10k objects and the more objects you have, the more will be the speedup.\n",
    "Because of that reason we have selected a large dataset - [Epsilon](https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html) (500.000 documents and 2.000 features) for this tutorial.\n",
    "Firstly, we will get the data through catboost.datasets module. The code below does this. It will run for approximately 10-15 minutes. So please be patient :)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 0,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "lzWG0GeAHVg9"
   },
   "outputs": [],
   "source": [
    "from catboost.datasets import epsilon\n",
    "\n",
    "train, test = epsilon()\n",
    "\n",
    "X_train, y_train = train.iloc[:,1:], train[0]\n",
    "X_test, y_test = test.iloc[:,1:], test[0]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "5JhczRSqBFnd"
   },
   "source": [
    "## Training on CPU\n",
    "Now we will train the model on CPU and measure execution time.\n",
    "We will use 100 iterations for our CPU training since otherwise it will take a long time.\n",
    "It will take around 15 minutes."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 305
    },
    "colab_type": "code",
    "executionInfo": {
     "elapsed": 889091,
     "status": "ok",
     "timestamp": 1550053912020,
     "user": {
      "displayName": "Sergey Brazhnik",
      "photoUrl": "https://lh3.googleusercontent.com/-qx9xOxVVPXo/AAAAAAAAAAI/AAAAAAAAA0Q/8-j43i6dgow/s64/photo.jpg",
      "userId": "09427549251805676502"
     },
     "user_tz": -420
    },
    "id": "3wuVZsTGC7WO",
    "outputId": "c111f469-266c-4ee8-a860-e1d4f0bcf1a6"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0:\tlearn: 0.6878003\ttest: 0.6878003\tbest: 0.6878003 (0)\ttotal: 7.64s\tremaining: 12m 36s\n",
      "10:\tlearn: 0.6460415\ttest: 0.6460415\tbest: 0.6460415 (10)\ttotal: 1m 20s\tremaining: 10m 53s\n",
      "20:\tlearn: 0.6169323\ttest: 0.6169323\tbest: 0.6169323 (20)\ttotal: 2m 30s\tremaining: 9m 25s\n",
      "30:\tlearn: 0.5948178\ttest: 0.5948178\tbest: 0.5948178 (30)\ttotal: 3m 38s\tremaining: 8m 5s\n",
      "40:\tlearn: 0.5766751\ttest: 0.5766751\tbest: 0.5766751 (40)\ttotal: 4m 44s\tremaining: 6m 49s\n",
      "50:\tlearn: 0.5615754\ttest: 0.5615754\tbest: 0.5615754 (50)\ttotal: 5m 48s\tremaining: 5m 35s\n",
      "60:\tlearn: 0.5484313\ttest: 0.5484313\tbest: 0.5484313 (60)\ttotal: 6m 52s\tremaining: 4m 23s\n",
      "70:\tlearn: 0.5370070\ttest: 0.5370070\tbest: 0.5370070 (70)\ttotal: 7m 55s\tremaining: 3m 14s\n",
      "80:\tlearn: 0.5265792\ttest: 0.5265792\tbest: 0.5265792 (80)\ttotal: 8m 58s\tremaining: 2m 6s\n",
      "90:\tlearn: 0.5173778\ttest: 0.5173778\tbest: 0.5173778 (90)\ttotal: 10m 4s\tremaining: 59.8s\n",
      "99:\tlearn: 0.5097820\ttest: 0.5097820\tbest: 0.5097820 (99)\ttotal: 11m 2s\tremaining: 0us\n",
      "\n",
      "bestTest = 0.5097819713\n",
      "bestIteration = 99\n",
      "\n",
      "Time to fit model on CPU: 888 sec\n"
     ]
    }
   ],
   "source": [
    "from catboost import CatBoostClassifier\n",
    "import timeit\n",
    "\n",
    "def train_on_cpu():  \n",
    "  model = CatBoostClassifier(\n",
    "    iterations=100,\n",
    "    learning_rate=0.03\n",
    "  )\n",
    "  \n",
    "  model.fit(\n",
    "      X_train, y_train,\n",
    "      eval_set=(X_test, y_test),\n",
    "      verbose=10\n",
    "  );   \n",
    "      \n",
    "cpu_time = timeit.timeit('train_on_cpu()', \n",
    "                         setup=\"from __main__ import train_on_cpu\", \n",
    "                         number=1)\n",
    "\n",
    "print('Time to fit model on CPU: {} sec'.format(int(cpu_time)))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "m0EafIhX-SgH"
   },
   "source": [
    "Take notice that learning time itself wothout data feeding is around 12 minutes. Whereas all the process consumes 14-15 min."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "C1jsWgxhCLTX"
   },
   "source": [
    "## Training on GPU\n",
    "The previous code execution has been done on CPU. It's time to use GPU!  \n",
    "We need to use '*task_type='GPU'*' parameter value to run GPU training. Now the execution time wouldn't be so big :)  \n",
    "BTW if Colaboratory shows you a warning 'GPU memory usage is close to the limit', just press 'Ignore'."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 287
    },
    "colab_type": "code",
    "executionInfo": {
     "elapsed": 195571,
     "status": "ok",
     "timestamp": 1550054220603,
     "user": {
      "displayName": "Sergey Brazhnik",
      "photoUrl": "https://lh3.googleusercontent.com/-qx9xOxVVPXo/AAAAAAAAAAI/AAAAAAAAA0Q/8-j43i6dgow/s64/photo.jpg",
      "userId": "09427549251805676502"
     },
     "user_tz": -420
    },
    "id": "Oq4Lpx1veXnI",
    "outputId": "dc367757-c05e-48be-9e20-7c69d43e9655"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0:\tlearn: 0.6877673\ttest: 0.6877673\tbest: 0.6877673 (0)\ttotal: 335ms\tremaining: 33.1s\n",
      "10:\tlearn: 0.6457423\ttest: 0.6457424\tbest: 0.6457424 (10)\ttotal: 2.54s\tremaining: 20.6s\n",
      "20:\tlearn: 0.6163271\ttest: 0.6163271\tbest: 0.6163271 (20)\ttotal: 4.61s\tremaining: 17.3s\n",
      "30:\tlearn: 0.5943045\ttest: 0.5943045\tbest: 0.5943045 (30)\ttotal: 6.59s\tremaining: 14.7s\n",
      "40:\tlearn: 0.5763315\ttest: 0.5763315\tbest: 0.5763315 (40)\ttotal: 8.56s\tremaining: 12.3s\n",
      "50:\tlearn: 0.5607702\ttest: 0.5607704\tbest: 0.5607704 (50)\ttotal: 10.5s\tremaining: 10.1s\n",
      "60:\tlearn: 0.5478195\ttest: 0.5478195\tbest: 0.5478195 (60)\ttotal: 12.4s\tremaining: 7.95s\n",
      "70:\tlearn: 0.5360011\ttest: 0.5360011\tbest: 0.5360011 (70)\ttotal: 14.3s\tremaining: 5.83s\n",
      "80:\tlearn: 0.5258044\ttest: 0.5258044\tbest: 0.5258044 (80)\ttotal: 16.2s\tremaining: 3.79s\n",
      "90:\tlearn: 0.5165437\ttest: 0.5165438\tbest: 0.5165438 (90)\ttotal: 18.1s\tremaining: 1.79s\n",
      "99:\tlearn: 0.5089656\ttest: 0.5089656\tbest: 0.5089656 (99)\ttotal: 19.7s\tremaining: 0us\n",
      "bestTest = 0.5089655859\n",
      "bestIteration = 99\n",
      "Time to fit model on GPU: 195 sec\n",
      "GPU speedup over CPU: 4.56x\n"
     ]
    }
   ],
   "source": [
    "def train_on_gpu():  \n",
    "  model = CatBoostClassifier(\n",
    "    iterations=100,\n",
    "    learning_rate=0.03,\n",
    "    task_type='GPU'\n",
    "  )\n",
    "  \n",
    "  model.fit(\n",
    "      X_train, y_train,\n",
    "      eval_set=(X_test, y_test),\n",
    "      verbose=10\n",
    "  );     \n",
    "      \n",
    "gpu_time = timeit.timeit('train_on_gpu()', \n",
    "                         setup=\"from __main__ import train_on_gpu\", \n",
    "                         number=1)\n",
    "\n",
    "print('Time to fit model on GPU: {} sec'.format(int(gpu_time)))\n",
    "print('GPU speedup over CPU: ' + '%.2f' % (cpu_time/gpu_time) + 'x')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "fp4-J4YlbDxF"
   },
   "source": [
    "As you can see GPU is much faster than CPU on large datasets. It takes just 3-4 mins vs 14-15 mins to fit the model. Moreover learning process consumes just 30 seconds vs 12 minutes! This is a good reason to use GPU instead of CPU!\n",
    "  \n",
    "Thank you for attention! "
   ]
  }
 ],
 "metadata": {
  "accelerator": "GPU",
  "colab": {
   "collapsed_sections": [],
   "name": "colaboratory_cpu_vs_gpu_tutorial.ipynb",
   "provenance": [
    {
     "file_id": "1iLzXm9f0oWYkukSgAh-9silP2T9MpERH",
     "timestamp": 1545215978704
    },
    {
     "file_id": "1yFVSjdR2nIWeQvOd5-rMdtLRHcI3lx-0",
     "timestamp": 1545214576187
    }
   ],
   "toc_visible": true,
   "version": "0.3.2"
  },
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
